A fully autonomous data reduction pipeline has been developed for FRODOSpec, an optical fibre-fed integral field
spectrograph currently in use at the Liverpool Telescope. This paper details the process required for the reduction of data taken
using an integral field spectrograph and presents an overview of the computational methods implemented to create the
pipeline. Analysis of errors and possible future enhancements are also discussed.
The Liverpool Telescope (LT; Steele et al. 2004) is a 2.0 metre
robotic telescope that operates unattended at the
Observatorio del Roque de Los Muchachos on
La Palma, Spain. Since robotic operations started in April
2004, the LT has produced data for a variety of science
programmes using software and instruments that were
designed and developed in-house at the Astrophysics Research
Institute of LJMU. Achieving first light in April 2009, the
Fibre-fed RObotic Dual-beam Optical Spectrograph
(FRODOSpec; Morales-Rueda et al. 2004) is the successor to the
now decommissioned Meaburn Spectrograph and has been
a common user instrument on the telescope since February
2010.

FRODOSpec is a bench mounted spectrograph with two
optical paths, known as arms, that are utilised by separating
the incident light around 5750 Å into two bandwidths using
a dichroic beam-splitter. The light down each arm is
collimated, dispersed and focused onto CCDs, with the elements
of each optical chain separately optimised for blue and red
light. An optical schematic is shown in Figure A1.

Two dispersive elements are available for each arm: a 
conventional diffraction grating and a higher resolution Vol- 
ume Phase Holographic (VPH) grating. The VPH is bonded 
to a prism so that the light is dispersed at the same angle 
as the grating, requiring no parts other than the pneumatic 
stage they are mounted upon to be moved when selecting 
between them. Resolving power and wavelength ranges for 
each arm and dispersive element are shown in Table 1.

Light is transmitted from the focal plane of the telescope
to the spectrograph by a bundle of one hundred and forty-four
optical fibres. At the telescope focal plane, the fibres
are arranged in a regular pattern (see Figure A1) to form a
12 × 12 integral field unit (IFU), with each fibre coupled to
a microlens to minimise light losses. Each fibre/microlens
covers a field of view on sky of ~0.83" × 0.83",
corresponding to a total field of view of approximately 10" × 10".
At the input of the spectrograph, the fibres are rearranged to
form a pseudo-slit which acts as the entrance aperture.

Despite the availability of instrument-unspecific software
packages to reduce data taken using an integral field
spectrograph, the constraints of their generic design typically
limit the highest achievable degree of automation to either
manual, like IRAF (Valdes 1992), or semi-automatic, like
R3D (Sánchez 2006), kungifu (Bolton & Burles 2007) and
P3D (Sandin et al. 2010). As these packages do not constitute
an end-to-end system whereby data products can be
produced without the need for human interaction,
development of a bespoke pipeline to reduce FRODOSpec data was
necessary in order that the following objectives could be
fulfilled:

- To autonomously produce a science-ready data product. 

- To autonomously produce a "quicklook" data product, 
allowing quick data quality assessment. 

- To have full quality control over the data products pro- 
duced. 

- To provide feedback for the automated LT scheduler. 

This paper details the second version of the pipeline,
deployed in May 2011, and is structured as follows. An
explanation of the input data and the output data products is
given in §2. §3 outlines the computational methods used to
process the data, with error analysis. Key pipeline
performance indicators are presented in §4. Concluding remarks
and possible future enhancements to the pipeline are
discussed in §5.

2 Overview 
2.1 Input data 

Data taken using an integral field spectrograph (see Figure
1) differs from traditional long-slit data in two major ways:
Arm / Dispersive Element   Wavelength Start   Wavelength End   Resolving Power   Dispersion
                           (Å)                (Å)                                (Å/px)

Red Grating                5800               9400             2200              1.6
Red VPH                    5900               8000             5300              0.8
Blue Grating               3900               5700             2600              0.8
Blue VPH                   3900               5100             5500              0.35

Table 1 Wavelength ranges, resolving powers and dispersions for the different FRODOSpec arms/dispersive elements.
Fig. 1 Exposures of Xenon (left) and Tungsten (right) taken using FRODOSpec. The profiles of the fibres can be clearly
seen in the Xenon arc frame. 



1. The flux propagates spatially as a function of fibre
   profile. In order to describe the spatial distribution of
   flux, a measure of spatial fibre profile width, $\sigma$, and fibre
   separation, $\delta$, must be introduced. In the context of
   numerical analysis, the following quantities are therefore
   defined:

   $\sigma$ is the FWHM of the gaussian that best describes the
   spatial profile of a fibre. Although not perfect, a single
   gaussian profile is typically a good approximation
   of the spatial flux distribution (see Figure 2),
   with residual flux accounting for no more than 2%
   of the total flux of the distribution. As the width of
   the fibre profile varies with spectrograph focus, and
   the focus is dependent on the ambient temperature in
   proximity to the instrument, there is no unique value
   for $\sigma$.

   $\delta$ is the spatial distance between adjacent fibre
   profile centroids, known as the spectral pitch. Due to
   fibre positioning errors within the pseudo-slit, there
   is also no unique value of $\delta$.

   To determine the distributions of $\sigma$ and $\delta$, both
   quantities were measured daily over a three week period for
   each dispersive element and arm (see Figures 3 and 4).

   The Starlink package, Figaro, was used to measure $\sigma$,
   with fibre profile centroids calculated by the pipeline
   (see §3.1.1) used to measure $\delta$. The optimum focal
   positions of the CCDs were maintained weekly by remotely
   driving the electronic translation stages upon which they
   are mounted, limiting the timescale over which the
   focus was allowed to drift.
2. The flux from the slit is spatially incoherent. As the 
two-dimensional fibre matrix at the IFU input must be 
rearranged into a one-dimensional slit at the output, the 
flux from adjacent fibres may originate from different 
positions on the focal plane. 

A complication arising from the combination of these
two differences is fibre cross-talk (see §3.2.1).

2.2 Output data products 

Data taken by FRODOSpec is reduced by two sequentially
invoked pipelines. The first pipeline, known as the L1, is
a CCD processing pipeline that performs bias subtraction,
overscan trimming and CCD flat fielding. This paper
focuses on the second pipeline, known as the L2, which
performs the processes specific to the reduction of data taken
using an integral field spectrograph.

Fig. 2 The spatial flux distribution modelled using single
gaussians to represent the fibre profiles. The solid and
dashed lines in i) represent the data and model respectively.
The residual flux in the wings of the profile cannot be accounted
for by a single gaussian, but accounts for no more than 2%
of the total flux distribution.

The science-ready data product is an eight part
multi-extension FITS (Hanisch et al. 2001) file with each
extension containing a snapshot of the data taken at key stages of
the reduction process. The lowest tier of reduction product
available to the user is the L1 image. The output format is
shown in Table 2.



In addition to the science-ready data product, a
composite raster image of the L1_IMAGE, SPEC_NONSS and
COLCUBE_NONSS extensions is made available through
the LT archive website. An example is shown in Figure B1.



2.3 Coding platform 

The pipeline has a command-line interface (CLI),
consisting of a series of programs written in C using the GNU
Scientific Library (GSL; Galassi et al. 2009) and CFITSIO
(Pence 1999) libraries.



The compiled binaries are linked through scripts written 
in TCSH, with the reduction image preview scripts requir- 
ing a combination of ImageMagick and GNUplot. 
3 Reduction Method

L2 reduction largely follows the process that has been
previously discussed in the development of both instrument
specific software packages for spectrographs like VIMOS
(Zanichelli et al. 2005), as well as unspecific ones like R3D,
kungifu and P3D.

Specifically, it consists of i) finding and tracing the
positions of the fibres at points along the dispersion axis, ii)
standard aperture extraction, iii) wavelength calibration, iv)
fibre throughput correction, v) spectral rebinning to a linear
wavelength calibration, vi) identification and subtraction of
sky-only fibres (only possible if they are available in the
field) and vii) reformatting of data to the desired output format
(see §2.2). A flow chart illustrating the reduction process is
shown in Figure C1.

No attempt has been made to either remove the contam- 
ination from scattered light and cosmic rays, or to correct 
for differential atmospheric refraction (DAR). A synopsis 
of each of these is given below. 

Scattered light is the general term given to light scat- 
tered off component optical surfaces of an instrument that 
has not followed the desired optical path, and can be identi- 
fied in data from a fibre-fed spectrograph as a smooth back- 
ground between the spatial profiles of the fibres. Removal of 
scattered light can only be reliably achieved if the spectral 
pitch is sufficiently large that a clean background can be in- 
terpolated spatially between the fibre profiles. Although the 
flux contribution from scattered light is dependent on fibre 
illumination, with larger values for longer exposure times 
and brighter targets, it has been estimated to contribute no 
more than 2% of the total flux for any given FRODOSpec 
observation (see §2.1).

Automated removal of cosmic rays from spectrographic
data is too unreliable to be implemented using current
methods such as L.A.Cosmic (van Dokkum 2001) and the DCR
routine (Pych 2004). The accuracy of the removal is
dependent on the location of the cosmic ray, with those lying close
to strong emission lines particularly hard to distinguish and
remove cleanly without the iterative manual process of
program parameter tweaking and inspection of results.

The extent to which DAR affects an observation can be
assessed by measuring the shift in target position at the
focal plane for different wavelengths, and is worse for larger
zenith angles (Filippenko 1982) and observations where the
target is required to be tracked by the telescope for long
periods. DAR must be corrected for by the user if
spectrophotometric accuracy is required. A method for its correction in
IFS has been documented (Arribas et al. 1999).

The L2 requires three frames to proceed: a Xenon arc 
frame, a continuum frame (typically an exposure of a Tung- 
sten lamp) and the target, or science, frame. 



3.1 Constructing the tramline map 

As the true cross-dispersion axis, $x$, and dispersion axis,
$\lambda$, are only approximately aligned with the corresponding
pixel axes of the CCD, it is required that a trace, or tramline
map, be made to characterise how the flux from each fibre
propagates along both axes of the CCD. This is done by
determining the relationship between CCD pixel coordinates
and the peak location of each fibre profile taken at intervals
along the dispersion axis.

Fig. 3 The distribution of the focus measure, $\sigma$. The hollow histograms show the distribution of the entire sample, while
the filled histograms show the configuration specific distribution for the: a) red grating, b) red VPH, c) blue grating and
d) blue VPH. The dotted lines represent the average for the whole sample (2.14 pixels), while the dashed lines show the
averages for the different dispersive elements and arms (2.21, 2.06, 1.96, 2.33 pixels respectively). The broader form of
the blue VPH distribution is indicative that the focus is non-uniform along at least one axis.

As a target frame will typically have insufficient signal 
through each fibre over the wavelength range required, an 
exposure of a continuum lamp is used. To minimise the ef- 
fects of shifting spectral positions on these traces due to 
temperature changes, continuum flats used by the L2 are 
taken nightly using a Tungsten lamp. 

3.1.1 Finding the fibre profile peaks 
(frfind) 

The first of the pipeline procedures, frfind, is a simple peak
finding routine that reads the CCD output row by row along
the dispersion axis (the literal distinction between the true
dispersion/cross-dispersion and actual CCD axes is ignored in
the following discussion), considering each pixel on the
cross-dispersion axis in turn and flagging a peak only if the
following criteria are all satisfied (cf. Sánchez 2006):
1. All pixels within a specified aperture either side of, and
   spatially contiguous to, the pixel currently being considered
   have fewer counts.

   Let $I(i,j)$ be the intensity of the currently considered
   pixel, with $i$ the pixel along the dispersion axis and $j$ the pixel along
   the cross-dispersion axis locating the peak number $k$,
   such that determination of the $k^{\mathrm{th}}$ peak verifies:

   $$I(i,j) > I(i,j+1), \quad I(i,j) > I(i,j+2), \quad \ldots, \quad I(i,j) > I(i,j+(n-1)/2)$$

   and

   $$I(i,j) > I(i,j-1), \quad I(i,j) > I(i,j-2), \quad \ldots, \quad I(i,j) > I(i,j-(n-1)/2)$$

   where $n$ is the aperture size in pixels.
   A sensible restriction on $n$ exists, in that it should be less
   than the spectral pitch. Additionally, as the aperture size
   is defined by a whole number of pixels, the L2 requires
   that $n$ be an odd integer.

2. The pixel distance to the previous peak is greater than a
   pre-specified minimum distance. This criterion is omitted
   when the first peak in the row is being considered.

   $$j_k - j_{k-1} > \text{minimum distance}$$

   In order that all fibres can be identified, the value of the
   minimum distance is set less than the minimum
   inter-fibre distance.

3. The currently considered pixel value is greater than the
   value at a pre-specified pixel distance either side by a
   minimum amount. The optimum value of this amount is
   determined automatically by the routine by cycling
   between pre-specified limits until the maximum number
   of rows with the correct number of peaks, equal to the
   number of fibres, is found.

   $$I(i,j) - I(i,j-l) > \text{minimum amount} \quad \text{and} \quad I(i,j) - I(i,j+l) > \text{minimum amount}$$

   where $l$ is the pre-specified cross-dispersion pixel
   distance to be checked either side of a candidate peak.

Fig. 4 The distribution of the spectral pitch measure, $\delta$. The hollow histograms show the distribution of the entire sample,
while the filled histograms show the configuration specific distribution for the: a) red grating, b) red VPH, c) blue grating
and d) blue VPH. The dotted lines represent the average for the whole sample (7.03 pixels), while the dashed lines show
the averages for the different dispersive elements and arms (6.98, 6.97, 7.08, 7.08 pixels respectively). The bimodal form
of the plots is a natural consequence of the misalignment of fibres in the pseudo-slit. If two fibres exhibit a smaller than
average inter-fibre distance, there should generally also exist a larger than average inter-fibre distance depending on the
position of the two fibres in the pseudo-slit.
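The three criteria can be sketched for a single CCD row as follows. This is an illustrative Python sketch rather than the pipeline's C implementation: the function name `find_peaks_in_row` and its parameter names are hypothetical, and the automatic tuning of the minimum amount described in criterion 3 is omitted.

```python
def find_peaks_in_row(row, n, min_distance, l, min_amount):
    """Return integer cross-dispersion indices of candidate peaks in one CCD row.

    row          : list of pixel counts along the cross-dispersion axis
    n            : odd aperture size in pixels (criterion 1)
    min_distance : minimum separation from the previous peak (criterion 2)
    l            : pixel offset used for the contrast check (criterion 3)
    min_amount   : minimum contrast required at offset l (criterion 3)
    """
    if n % 2 == 0:
        raise ValueError("aperture size n must be odd")
    half = (n - 1) // 2
    margin = max(half, l)  # stay far enough from the row edges for all checks
    peaks = []
    for j in range(margin, len(row) - margin):
        # Criterion 1: strictly brighter than every other pixel in the aperture.
        neighbours = row[j - half:j] + row[j + 1:j + half + 1]
        if any(row[j] <= value for value in neighbours):
            continue
        # Criterion 2: far enough from the previous peak (skipped for the first).
        if peaks and j - peaks[-1] <= min_distance:
            continue
        # Criterion 3: sufficient contrast at distance l on both sides.
        if row[j] - row[j - l] <= min_amount or row[j] - row[j + l] <= min_amount:
            continue
        peaks.append(j)
    return peaks
```

Running this over every row and counting the rows that yield exactly one peak per fibre would give the quantity that the routine maximises when choosing the minimum amount.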



Using the GSL least-squares polynomial fitting algorithm,
sub-pixel peak locations are determined by finding the
position of the maximum of a parabola fitted to the flux
of three pixels at locations along the cross-dispersion axis
of $j_k - 1$, $j_k$ and $j_k + 1$.
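A parabola fitted by least squares to exactly three points passes through all of them, so the position of its maximum has a closed form. The sketch below uses that closed form instead of a call to the GSL fitting routine; the function names are hypothetical.

```python
def parabolic_peak(y_minus, y_0, y_plus):
    """Offset, in pixels relative to the central pixel, of the vertex of the
    parabola passing through (-1, y_minus), (0, y_0) and (+1, y_plus)."""
    denom = y_minus - 2.0 * y_0 + y_plus
    if denom >= 0.0:
        # A non-negative second difference means there is no maximum here.
        raise ValueError("central pixel is not a local maximum")
    return 0.5 * (y_minus - y_plus) / denom


def subpixel_location(row, j):
    """Sub-pixel peak location for the integer peak index j in a row of counts."""
    return j + parabolic_peak(row[j - 1], row[j], row[j + 1])
```

For a symmetric profile the offset is zero, and it moves towards whichever neighbouring pixel holds more flux.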
HDU Index   EXTNAME         Format     Wavelength    Throughput   Sky
                                       Calibrated?   Corrected?   Subtracted?

0           L1_IMAGE        Image      no            no           no
1           RSS_NONSS       RSS        yes           yes          no
2           CUBE_NONSS      Datacube   yes           yes          no
3           RSS_SS          RSS        yes           yes          yes
4           CUBE_SS         Datacube   yes           yes          yes
5           SPEC_NONSS      Spectrum   yes           yes          no
6           SPEC_SS         Spectrum   yes           yes          yes
7           COLCUBE_NONSS   Image      no            yes          no

Table 2 The format of the science-ready data product. Extensions can be accessed through either their HDU index
or corresponding EXTNAME key. Row Stacked Spectra (RSS) frames are used to display each extracted spectrum as a
single row of height one pixel. Datacubes reimage the focal plane at each wavelength using the IFU input to output head
mappings (two spatial axes, x and y, and the dispersion axis, z). Datacubes are a standard reduction product for integral field
spectroscopy (IFS) and many software reduction packages have tools to visualise and manipulate them. The final extension
is an IFU fibre matrix image of the focal plane, which is the CUBE_NONSS extension with a collapsed dispersion axis. If
sky subtraction (SS) is unsuccessful, the corresponding HDUs (3, 4 and 6) will be blank. Wavelength is calibrated in units
of Å.



The routine, and consequently reduction, is aborted if 
the number of rows with the correct number of peaks is less 
than a pre-specified value. 



3.1.2 Cleaning erroneous entries from the peak list 
(frclean) 

The second routine, frclean, is used to remove rows with
an appreciable likelihood of containing peaks that have
either been incorrectly classified or have poorly determined
locations. These rows are identified by cycling through the
peaks for each row in the peak list generated by frfind, and
calculating the pixel distance between the cross-dispersion
location of the currently considered peak and the average
cross-dispersion location for that peak number, $\bar{j}_k$,
determined using all rows. Any peak in a particular row that has a
distance larger than a pre-specified maximum distance will
restrict all peaks in that row from further use by the pipeline.



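A row is therefore rejected if, for any peak $k$ with location $j_k$
and average location $\bar{j}_k$,

$$\left| j_k - \bar{j}_k \right| > \text{maximum distance}$$

The maximum distance parameter is set to allow for the
inherent spectral curvature and rotation.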
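The row rejection can be sketched as below. This is a minimal illustration assuming that every row passed in already holds the correct number of peaks; the name `clean_peak_list` is hypothetical.

```python
def clean_peak_list(rows, max_distance):
    """Discard rows containing any peak too far from that peak's mean location.

    rows         : list of rows, each a list of peak locations (one per fibre)
    max_distance : maximum allowed |location - mean location| in pixels
    """
    n_peaks = len(rows[0])
    # Average cross-dispersion location of each peak number over all rows.
    means = [sum(r[k] for r in rows) / len(rows) for k in range(n_peaks)]
    return [r for r in rows
            if all(abs(r[k] - means[k]) <= max_distance for k in range(n_peaks))]
```

Note that a single outlying peak removes the whole row, as in the description above, and that the averages are computed before any row is rejected.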



3.1.3 Fitting polynomials to the fibre profile peaks 
(frtrace) 

The remaining peak locations in the peak list generated by 
frclean are binned along the dispersion axis for each fibre. 
The average cross-dispersion locations of the peaks con- 
tained within each bin are then calculated and polynomial 
fits to these positions and the bin centroids are made using 
the GSL least-squares polynomial fitting algorithm. Figure
5 illustrates examples of these fits for fibre number 78,
central within the pseudo-slit and representative of the other
fibres in the bundle.

The quality of the fit depends largely on the number of 
bins, or bin width, specified. A smaller bin width increases 
the likelihood of empty bins, where no peaks can be found 
between the bin limits. Conversely, a larger bin width re- 
duces the number of coordinates used in the fitting process. 
As the curvature is well defined by low order polynomial
fits (see Figure 5), larger bin widths are most suitable for
this routine.
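The binning and fitting step can be sketched in pure Python as follows. The pipeline itself uses the GSL least-squares routine; here the normal equations are solved directly, which is an assumption made only to keep the example self-contained. The names `polyfit` and `trace_fibre` are hypothetical.

```python
def polyfit(xs, ys, order):
    """Least-squares polynomial coefficients [c0, c1, ...] via the normal equations."""
    m = order + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    aug = [[sum(x ** (r + c) for x in xs) for c in range(m)]
           + [sum(y * x ** r for x, y in zip(xs, ys))] for r in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        tail = sum(aug[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (aug[r][m] - tail) / aug[r][r]
    return coeffs


def trace_fibre(peaks, bin_width, order):
    """Bin (dispersion pixel, cross-dispersion location) pairs for one fibre and
    fit a polynomial to the bin centroids and mean peak locations."""
    bins = {}
    for i, j in peaks:
        bins.setdefault(int(i // bin_width), []).append((i, j))
    # Empty bins never appear in the dictionary, so they are skipped implicitly.
    ordered = [b for _, b in sorted(bins.items())]
    centres = [sum(i for i, _ in b) / len(b) for b in ordered]
    means = [sum(j for _, j in b) / len(b) for b in ordered]
    return polyfit(centres, means, order)
```

A wider bin averages more peaks per point but leaves fewer points for the fit, which is the trade-off described above.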

3.2 Extraction of flux 
(frextract) 

A standard aperture extraction is performed using the traces 
determined by frtrace. In standard aperture extraction, all
pixels within an aperture centered on the trace centroid are 
assumed to have equal statistical weight. Aperture bound- 
aries that lie inside pixels are accounted for by adding the 
fractional flux from the corresponding pixel. 

For a raw image, $C$, the data values, $D$, after processing
by the L1 CCD pipeline are given by (cf. Horne 1986):

$$D_{x\lambda} = \frac{C_{x\lambda} - B_{x\lambda}}{F_{x\lambda}}$$

where $B$ is the master bias image and $F$ is the flat field
image. The integrated flux for each spectrum, $f_\lambda$, and
variance, $\mathrm{var}_\lambda$, between the spatial aperture limits $x_1$ and $x_2$ for
a wavelength $\lambda$ are:

$$f_\lambda = \sum_{x=x_1}^{x_2} D_{x\lambda}$$

$$\mathrm{var}_\lambda = \sum_{x=x_1}^{x_2} V_{x\lambda}$$

with the variance on an individual pixel, $V_{x\lambda}$,
approximately given by:

$$V_{x\lambda} \approx \left(\frac{r}{G}\right)^2 + \frac{|D_{x\lambda}|}{G}$$

where $r$ is the readout noise (e⁻), assumed to be
constant across the CCD, and $G$ is the gain of the CCD (e⁻ /
ADU). As a large number of frames are stacked to
generate both the master bias and flat field frames, their resulting
error contribution is assumed to be negligible.

Fig. 5 Example polynomial traces using fibre number 78 for the a) red grating, b) red VPH, c) blue grating and d)
blue VPH configurations. The blue and red hatched regions represent areas on the CCD for which the pixels have a
corresponding wavelength that is outside of the final wavelength calibration. The solid line represents a fit to the data of
order 2, a dashed line of order 3 and a dotted line of order 4. For the wavelength ranges required, a polynomial order of 2
was found to be suitable for all configurations but the red grating, where an order of 3 was found to better account for the
curvature towards the blue end of the CCD. Using these orders, the root mean square (RMS) discrepancies between the fit and
data used in the final wavelength calibration were 0.038, 0.053, 0.047 and 0.053 pixels respectively.

The L2 uses a 5 pixel extraction aperture, chosen as a
compromise between minimising fibre cross-talk and
maximising the amount of flux recovered from the fibre profile.
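A standard aperture extraction with fractional edge pixels can be sketched as follows. Several details are assumptions made for illustration: pixel $k$ is taken to span $[k, k+1)$, the per-pixel variance is taken to be a readout term plus a Poisson term in ADU-based units, and the variance of a fractional pixel is scaled by the square of its fraction. The function name and parameters are hypothetical.

```python
import math


def extract_aperture(column, centre, width=5.0, read_noise=0.0, gain=1.0):
    """Sum the flux in an aperture of `width` pixels centred on `centre`.

    column     : counts (ADU) along the cross-dispersion axis at one wavelength
    centre     : trace centroid in pixel coordinates (pixel k spans [k, k+1))
    read_noise : readout noise in electrons, assumed constant
    gain       : CCD gain in electrons per ADU
    Returns (flux, variance).
    """
    lo, hi = centre - width / 2.0, centre + width / 2.0
    flux = variance = 0.0
    for k in range(max(0, math.floor(lo)), min(len(column), math.ceil(hi))):
        # Fraction of pixel k lying inside the aperture [lo, hi].
        frac = min(hi, k + 1) - max(lo, k)
        flux += frac * column[k]
        # Assumed noise model: readout term plus Poisson term, scaled by frac^2.
        variance += frac ** 2 * ((read_noise / gain) ** 2 + abs(column[k]) / gain)
    return flux, variance
```

With a uniform column of 10 ADU and a 5 pixel aperture centred on a pixel boundary plus one half, two half pixels and four whole pixels contribute, giving a flux of 50 ADU.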



3.2.1 Fibre cross-talk 

Cross-talk occurs when the profiles of adjacent fibres overlap
spatially. The severity of cross-talk is dependent on a variety of
factors including $\sigma$ and $\delta$.

To quantify this effect for FRODOSpec, an approximate
two aperture analysis is used. Each aperture has a width in
pixels of $\gamma$, with the centres of the apertures separated by the
spectral pitch, $\delta$. A fibre is placed with centroid at $x = \delta$
and the corresponding flux, $f_{CT}$, recovered by an aperture
centered at $x = 0$ is calculated. As previously, the fibre profile
is modelled using a single gaussian such that:

$$f_{CT} = \int_{-\gamma/2}^{\gamma/2} e^{-\frac{(x-\delta)^2}{2\sigma^2}} \, dx$$

which is then represented as a percentage of the total
flux. Figure 6 shows how this cross-talk quantifier varies
for different $\delta$, $\sigma$ and $\gamma$. Figure 7 shows how the flux
recovered by an aperture centered on the fibre centroid varies for
different $\sigma$ and $\gamma$.
Fig. 6 Percentage of total flux contained in an adjacent
aperture for varying $\sigma$, $\delta$ and $\gamma$. The dotted, solid and dashed
lines represent a 7 pixel, 5 pixel and 3 pixel extraction
aperture respectively.



To assess if the effect of cross-talk must be taken into
account, two cases representing the worst ($\sigma$ = 3.5 px, $\delta$ =
6.5 px) and average ($\sigma$ = 2.14 px, $\delta$ = 7.03 px) data conditions
are presented for a 5 pixel extraction aperture.

In the worst case, ~90% of the flux is recovered with a
maximum of 0.4% of the flux contained singly within
adjacent apertures. In the average case, more than 99% of the
flux is recovered with less than 0.1% of the flux contained
singly within adjacent apertures.
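The two-aperture analysis can be evaluated numerically with the error function. The sketch below treats the profile width as a FWHM, following its definition in §2.1, and converts it to a gaussian standard deviation before integrating; that conversion is an assumption, adopted here because it gives values close to those quoted above. The function names are hypothetical.

```python
import math

# Conversion from FWHM to gaussian standard deviation.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_fraction(lo, hi, centre, fwhm):
    """Fraction of a unit-area gaussian of given FWHM falling between lo and hi."""
    s = fwhm * FWHM_TO_SIGMA * math.sqrt(2.0)
    return 0.5 * (math.erf((hi - centre) / s) - math.erf((lo - centre) / s))


def recovered_and_crosstalk(fwhm, pitch, aperture):
    """Percentages of a fibre's flux falling in its own aperture and in a single
    neighbouring aperture offset by one spectral pitch."""
    half = aperture / 2.0
    own = gaussian_fraction(-half, half, 0.0, fwhm)
    neighbour = gaussian_fraction(-half, half, pitch, fwhm)
    return 100.0 * own, 100.0 * neighbour


worst = recovered_and_crosstalk(3.5, 6.5, 5.0)
average = recovered_and_crosstalk(2.14, 7.03, 5.0)
```

Under these assumptions the worst case recovers roughly 90% of the flux with a few tenths of a percent in a neighbouring aperture, and the average case recovers more than 99%.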

In order to justifiably apply these figures to real data, 
an additional consideration must first be made regarding the 
ability of the pipeline to accurately locate and trace the po- 
sition of each fibre within the data. For this to be estimated, 
it is required that the true peak locations be known a priori. 
Fig. 7 Percentage of total flux recovered in the aperture
for varying $\sigma$ and $\gamma$. The dotted, solid and dashed lines
represent a 7 pixel, 5 pixel and 3 pixel extraction aperture
respectively.



Information regarding the spectral flux distribution, the 
order of tracing polynomial, the spectral pitches and the 
spectral and spatial FWHM distributions was used to gener- 
ate simulated data for each configuration. This information 
was determined using the L2 intermediate reduction prod- 
ucts. 

After processing the simulated data using the pipeline, 
the maximum and average difference values were calculated 
by subtracting the known peak locations from the locations 
determined using the tracing coefficients. The results are
shown in Figure 8.

As the discrepancies within the final wavelength cali- 
brations are small, centroiding errors introduced by locat- 
ing (frfind) and tracing (frtrace) the fibres can be ignored 
and the previous flux extraction and spatial cross-talk esti- 
mations are considered accurate. Consequently, cross-talk 
is considered a negligible effect for FRODOSpec and here- 
after not considered. 



3.3 Arc fitting
(frarcfit)


The L2 automatically attempts to fit dispersion solutions.

Spectra in the arc RSS frame are first cross-correlated to
remove zeroth order offsets arising from both optical
distortions and small errors in alignment of the fibres within the
slit. This removes the gross curvature and aligns the spectra
to ±1 px. The cross-correlation is performed using a limited
window size, decreasing execution times and reducing the
influence of bad data on the determination of the offset. An
example of this process is shown in panels i) and ii) of
Figure 9. The routine then identifies candidate lines in the data
from each spectrum using the criteria outlined in §3.1.1 and
a few additional constraints described below.
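The offset determination can be sketched as an integer-lag cross-correlation. In this sketch the limited window is interpreted as a bound on the lags searched, which is an assumption; the function name `best_offset` is hypothetical.

```python
def best_offset(reference, spectrum, max_lag):
    """Integer shift (in pixels) of `spectrum` relative to `reference` that
    maximises their cross-correlation over lags in [-max_lag, max_lag]."""
    def correlation(lag):
        total = 0.0
        for i, ref in enumerate(reference):
            k = i + lag
            # Only overlapping samples contribute to the sum.
            if 0 <= k < len(spectrum):
                total += ref * spectrum[k]
        return total
    return max(range(-max_lag, max_lag + 1), key=correlation)
```

Shifting each spectrum back by its offset relative to a chosen reference spectrum aligns the arc lines to within about a pixel before line identification.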

A candidate line is defined as a cross-spectrum set of
spatially contiguous peaks. Spatial contiguity is determined
by considering each peak in the distribution of a single
spectrum of the RSS frame, and checking to see if there exists a
peak at the same location along the dispersion axis, within
a pre-specified tolerance, for all remaining spectra.

Fig. 8 Centroid difference images using simulated data for each configuration. Differences were calculated by
subtracting the known peak locations from the centroids determined using the polynomial tracing coefficients, and are therefore
representative of the conjoint error in centroiding (frfind) and tracing (frtrace). The hatched regions represent areas on the
CCD for which the pixels have a corresponding wavelength that is outside of the final wavelength calibration. Maximum
and average difference values within these calibrations (SUBSET) and for the entire range of the CCD (ALL) are shown
for each figure. As expected, the largest discrepancy is present in the red grating data where a significant fraction of the
CCD, the majority of which lies outside of the final calibration, is insufficiently well illuminated to determine an accurate
centroid.

Candidate lines are checked against a reference arc line 
list, which contains information on the wavelength of the 
emission lines and their approximate pixel position. In order 
for an identified candidate line to be associated with one 
from the list, the distance between the two must be less than 
that of a pre-specified number of pixels. 
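The association of candidate lines with the reference list can be sketched as a nearest-neighbour match with a pixel tolerance. The function name `match_lines` and the data layout are hypothetical.

```python
def match_lines(candidates, reference, max_pixels):
    """Associate candidate line positions (pixels) with reference arc lines.

    candidates : list of candidate line positions in pixels
    reference  : list of (wavelength, approximate pixel position) tuples
    max_pixels : maximum allowed distance for an association
    Returns a list of (candidate pixel, wavelength) pairs.
    """
    matches = []
    for wavelength, ref_pixel in reference:
        nearest = min(candidates, key=lambda c: abs(c - ref_pixel), default=None)
        if nearest is not None and abs(nearest - ref_pixel) < max_pixels:
            matches.append((nearest, wavelength))
    return matches
```

The matched pairs are the pixel and wavelength coordinates that would then be passed to the dispersion fit, provided the subsequent checks on the number and spread of lines succeed.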

As incorrect line association significantly affects the
accuracy of the fits, both the peak-finding criteria and
reference arc lines are selected so as to limit the number of
detections and associations made to only those that are deemed
most suitable for the purpose of automatic arc fitting.
Suitability is determined by careful inspection of the data, in
order that weak, saturated and crowded lines are not used
in this process. It should be noted that a failure to find one
of the lines is an indicator that the nature of the data has
changed significantly since the reference arc line list was
made, the most likely cause being a spectral shift due to
substantial temperature changes (the L2 can tolerate small
shifts).

Once all candidate lines have been identified, two fur- 
ther checks are made to ensure the arc fitting routine pro- 
duces a reliable and consistent solution: 

1 . The number of lines matched must be greater than a pre- 
specified number. 

2. The RMS wavelength distance between identified lines
and the RMS wavelength distance between lines from
the reference arc line list must be within a pre-specified
amount.

The joint success of these criteria ensures that the L2
identifies a reasonable quantity and spread of lines taken
from within the entire wavelength range. If either of these
criteria is not met, the reduction is aborted.
Fig. 9 The arc frame at various points in the arc fitting and rebinning process. The top panels show the arc RSS frames,
while the bottom panels show the maximum difference along the dispersion axis between peaks in the arc RSS frame for a
single emission line. i) The gross spectral curvature (~3 px) is evident before the frame is processed but this is removed in
ii) by cross-correlating a single spectrum with the others, reducing the maximum difference to < 1 px. Dispersion solutions
are then found for each spectrum in the RSS frame and the flux rebinned to a linear wavelength solution in iii). Post-rebin
maximum differences for the 3 week dataset were calculated for the red grating, red VPH, blue grating and blue VPH and
found to be 0.23±0.06, 0.29±0.31, 0.28±0.22 and 0.32±0.44 pixels (or 0.37±0.1, 0.23±0.25, 0.22±0.18, 0.11±0.15 Å)
respectively.



On success, a set of linear equations in a pixel
dimension, $x$, and wavelength, $\lambda$, are solved using the GSL
least-squares polynomial fitting algorithm. Suitable fitting orders
were determined by manually finding the dispersion
solutions using the arc routine in the Figaro package, and
selecting an order that minimised the average line residual RMS
without introducing too many free parameters (see Figure 10).



3.4 Throughput correction 
(frcorrectthroughput) 

Fibre defects and misalignments during positioning in the 
pseudo-slit cause the throughput to vary from fibre to fibre. 
To correct for this, normalisation coefficients are applied to 
each data value in the target RSS frame. These coefficients 
are determined by division of the total flux of the median 
spectrum by the total flux of each spectrum in the continuum 
RSS frame. 
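The normalisation can be sketched as follows. Here the median spectrum is interpreted as the per-wavelength median across all fibres, whose flux is then summed; this interpretation is an assumption, as are the function names.

```python
import statistics


def throughput_coefficients(continuum_rss):
    """Normalisation coefficient per fibre from a continuum RSS frame,
    given as a list of spectra on a common pixel grid."""
    # Median spectrum: median across fibres at each pixel along the dispersion axis.
    median_spectrum = [statistics.median(column) for column in zip(*continuum_rss)]
    median_total = sum(median_spectrum)
    return [median_total / sum(spectrum) for spectrum in continuum_rss]


def correct_throughput(target_rss, coefficients):
    """Apply each fibre's coefficient to every data value of its target spectrum."""
    return [[c * value for value in spectrum]
            for spectrum, c in zip(target_rss, coefficients)]
```

Fibres with lower than median throughput receive coefficients greater than one, and vice versa.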

Average normalisation coefficients for the three week
dataset are shown in Figure 11. The percentage relative
standard deviation (%RSD) was also calculated using all
continuum frames and averaged for all wavelengths along the
dispersion axis that were within the post-rebin wavelength
calibration limits. %RSD is the absolute value of the
coefficient of variation, expressed as a percentage, i.e.

$$\%\mathrm{RSD} = \frac{s}{|\bar{x}|} \times 100$$

where $s$ is the standard deviation and $\bar{x}$ the mean of the values.

Fig. 10 Average order-dependent RMS line deviations for
the red grating (circle), red VPH (triangle), blue grating
(upside down triangle) and blue VPH (square)
configurations. Axes ranges have been selected to highlight the RMS
line deviation differences between orders 2 through 6. Filled
symbols represent the final orders chosen for each
configuration.
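As a small illustration, the %RSD of a set of values can be computed as below. Whether the sample or population standard deviation was used is not stated, so the use of the sample form here is an assumption; the function name is hypothetical.

```python
import statistics


def percent_rsd(values):
    """Percentage relative standard deviation of a sequence of values,
    using the sample standard deviation (an assumption)."""
    mean = statistics.mean(values)
    return 100.0 * statistics.stdev(values) / abs(mean)
```

Applied to the fluxes of all fibres at a single wavelength in a continuum RSS frame, this gives the quantity that is then averaged along the dispersion axis.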
Fig. 11 Average throughput normalisation coefficients calculated using Tungsten RSS frames for each arm and dispersive
element. The graphs have been offset from each other by unity for clarity. Fibres 58 and 60 are known damaged fibres, and
are corrected for by allocation of larger coefficients by the frcorrectthroughput routine. The gradual curvature in the blue
data may be indicative of vignetting in the optics, with larger throughput coefficients being applied to fibres at the ends of
the pseudo-slit to compensate.



The averaged %RSDs before and after throughput
correction are shown in Table 3.



Arm / Dispersive Element    RSD (Before)    RSD (After)
                            (%)             (%)

Red Grating                 12.3            9.7
Red VPH                     9.6             6.2
Blue Grating                11.8            2.6
Blue VPH                    12.8            5.9


Table 3 Average %RSDs determined using Tungsten 
RSS frames before and after throughput correction. The 
routine is less effective at normalising the red arm through- 
puts due to the CCD fringing pattern, which introduces an 
appreciable spatial dependency to the total fibre fluxes in 
well illuminated continuum frames. 
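
As an illustration of how the averaged %RSD values could be obtained, the sketch below computes the fibre-to-fibre %RSD at each dispersion pixel and averages it over a pixel range. The function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def percent_rsd(values):
    """Absolute coefficient of variation of `values`, as a percentage."""
    # Assumption: population standard deviation (pstdev), not sample.
    return abs(pstdev(values) / mean(values)) * 100.0


def mean_percent_rsd(rss, first_pixel, last_pixel):
    """Average the fibre-to-fibre %RSD over an inclusive pixel range."""
    per_pixel = [
        percent_rsd([spectrum[p] for spectrum in rss])
        for p in range(first_pixel, last_pixel + 1)
    ]
    return mean(per_pixel)


# Three fibres, two dispersion pixels; fibres differ by +/-10% in throughput.
rss = [[9.0, 18.0], [10.0, 20.0], [11.0, 22.0]]
rsd = mean_percent_rsd(rss, 0, 1)
```

In this toy frame each pixel has the same relative scatter, so the averaged value equals the per-pixel value (about 8.2%).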



3.5 Spectral rebinning 
(frrebin) 

In order to add together flux from different spectra, it is 
first required to apply a single wavelength solution to all 
the spectra. This is done by rebinning the flux along the dis- 
persion axis for each spectrum in the target RSS frame. 



Using configuration specific wavelength start, end and
pixel scale parameters (see Table 1), the value of the
rebinned flux at determined wavelength intervals is calculated
by linearly interpolating the flux between the two straddling
wavelength values. Flux is conserved by calculating the
total fluxes in each spectrum before and after rebinning, and
applying a conservation factor. Example output is shown in
panel iii) of Figure 9. RMS line residuals using a manual
and automatic fit are shown in Figure 12.
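
The rebinning step can be sketched as below. This is an illustrative outline under stated assumptions, not the frrebin implementation: the function name and argument names are hypothetical, the input wavelengths are assumed to be strictly increasing and to cover the requested range, and the conservation factor is assumed to be the ratio of the summed flux before and after interpolation.

```python
from bisect import bisect_right


def rebin_spectrum(wavelengths, fluxes, start, end, step):
    """Interpolate onto a linear wavelength grid, conserving total flux."""
    n_bins = int(round((end - start) / step)) + 1
    grid = [start + k * step for k in range(n_bins)]

    rebinned = []
    for w in grid:
        # Index of the sample at or just below w, clamped so that the
        # upper straddling sample (lo + 1) always exists.
        lo = min(max(bisect_right(wavelengths, w) - 1, 0), len(wavelengths) - 2)
        hi = lo + 1
        frac = (w - wavelengths[lo]) / (wavelengths[hi] - wavelengths[lo])
        rebinned.append(fluxes[lo] + frac * (fluxes[hi] - fluxes[lo]))

    # Conservation factor: total flux before divided by total flux after.
    total_before, total_after = sum(fluxes), sum(rebinned)
    factor = total_before / total_after if total_after != 0 else 1.0
    return grid, [value * factor for value in rebinned]


# Irregularly sampled input spectrum rebinned to a 2 Angstrom pixel scale.
wavelengths = [5000.0, 5001.5, 5003.5, 5006.0]
fluxes = [10.0, 13.0, 17.0, 22.0]
grid, rebinned = rebin_spectrum(wavelengths, fluxes, 5000.0, 5006.0, 2.0)
```

Applying the same start, end and step to every fibre gives all spectra a common wavelength solution, so that flux from different fibres can be added pixel by pixel.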

3.6 Sky identification and subtraction 
(fridsky) 

As FRODOSpec's IFU has a small field of view, a routine 
that attempts to mask the target and interpolate a sky back- 
ground would be unreliable as a) the target may be located 
toward the edges of the IFU and b) the target may be ex- 
tended, making the interpolation ill-defined. The L2 does 
attempt sky subtraction, but instead uses a sky-only fibre 
dataset to calculate the median flux at each wavelength. This 
contribution is then subtracted off all the spectra in the tar- 
get RSS frame, leaving the target-only flux. 

To construct the sky-only fibre dataset, the fluxes
contained within each spectrum in the target RSS frame are
summed along the dispersion axis to produce a dataset, $X$,
containing the total flux, $I_t$, of each fibre. An iterative
σ-clipping algorithm is then used to remove fibres containing
target flux from the dataset to create a sky-only fibre
dataset, $X_i$, for each iteration $i$, where $X_i \subseteq X$.
For a fibre to qualify as containing target flux within an
iteration, its total flux must be greater than $n_1$ sigma
from the mean of the determined sky-only fibre dataset for
the iteration, $\bar{X}_i$:

$$I_t > \bar{X}_i + n_1 \sigma[X_i]$$

With the first iteration proceeding on the complete dataset,
$X$, the process is repeated using each resulting dataset
until no more fibres are identified as containing target flux
and the final iteration of the sky-only dataset, $X_f$, is
found. The L2 uses a global detection limit of $n_1 = 2$.

Aside from a check to ensure that $X_f \neq X$, a comparison
between the mean of the final iteration of the sky-only
fibre dataset, $\bar{X}_f$, and the median of the complete
fibre dataset, $\tilde{X}$, is made to check if the average
total sky background flux is statistically similar to the
total flux of the 77th and 78th brightest fibres:

$$\bar{X}_f - n_2 \sigma[X_f] < \tilde{X} < \bar{X}_f + n_2 \sigma[X_f]$$

where the value of $n_2$ selected determines to what degree
the two values must be statistically similar. It is
currently set to 1.

Fig. 12 RMS line residuals for all known lines using a manual and automatic fit for the a) red grating, b) red VPH, c) blue
grating and d) blue VPH configurations. The hatched regions represent areas where the automatic fit is an extrapolation
(due to frarcfit only using a selection of the known available lines). The hollow triangles represent the residuals of the lines
determined using the manual fit, while the filled circles represent the residuals of the lines determined using the automatic
fit. The RMS line residuals for the manual fit were 0.26, 0.10, 0.22 and 0.07 Å respectively. The RMS line residuals for the
automatic fit were 0.27, 0.08, 0.28 and 0.19 Å respectively. Residuals for lines that could not be accurately centroided are
omitted from these plots.


These checks serve as a catch for cases where a) there is 
no target in the IFU b) the source is extended such that the 
flux is invariant across the entire IFU and c) the observation 
of the target fills more than 50% of the IFU. Cases a) and b) 
are computationally identical. 
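
The identification, checking and subtraction steps of this section can be sketched as follows. This is a simplified outline rather than the pipeline's own routine: the function names are hypothetical, an RSS frame is a list of per-fibre flux lists, and the population standard deviation is assumed for the clipping.

```python
from statistics import mean, median, pstdev


def identify_sky_fibres(total_fluxes, n1=2.0):
    """Iteratively clip fibres brighter than mean + n1 * sigma."""
    sky = list(range(len(total_fluxes)))  # first iteration: every fibre
    while True:
        values = [total_fluxes[i] for i in sky]
        limit = mean(values) + n1 * pstdev(values)
        kept = [i for i in sky if total_fluxes[i] <= limit]
        if len(kept) == len(sky):  # no further fibres flagged as target
            return sky
        sky = kept


def sky_subtraction_is_sensible(total_fluxes, sky, n2=1.0):
    """Apply the two checks made before sky subtraction is attempted."""
    if len(sky) == len(total_fluxes):
        return False  # nothing was clipped: no target, or a flat source
    values = [total_fluxes[i] for i in sky]
    centre, spread = mean(values), n2 * pstdev(values)
    return centre - spread < median(total_fluxes) < centre + spread


def subtract_sky(rss, sky):
    """Subtract the per-wavelength median of the sky fibres from all fibres."""
    n_pixels = len(rss[0])
    sky_spectrum = [median(rss[i][p] for i in sky) for p in range(n_pixels)]
    return [[f - s for f, s in zip(spectrum, sky_spectrum)] for spectrum in rss]


# Eight sky fibres and two fibres containing target flux.
rss = [[4.0, 5.0], [5.0, 5.0], [6.0, 5.0], [4.0, 5.0], [5.0, 5.0],
       [6.0, 5.0], [5.0, 5.0], [5.0, 5.0], [30.0, 30.0], [20.0, 20.0]]
totals = [sum(spectrum) for spectrum in rss]
sky = identify_sky_fibres(totals)
sensible = sky_subtraction_is_sensible(totals, sky)
subtracted = subtract_sky(rss, sky)
```

In this example the two bright fibres are removed in successive iterations, the remaining eight fibres form the sky-only dataset, and both checks pass because the median of all ten totals lies within one sigma of the sky mean.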

To assess the effectiveness of the routine, exposures of 
the blank night sky were taken using all configurations. A 
modified version of the routine was used so that fibres from 
half of the IFU were forcibly assigned as sky and half were 
assigned as target. Bypassing the usual checks, analysis was 
then carried out on the faux target fibre dataset. The results
are shown in Figure 13. OI line emission and OH bands
(Meinel 1950) have been marked.
Fig. 13 Median non sky subtracted and sky subtracted spectra of the blank night sky for the a) red grating, b) red VPH,
c) blue grating and d) blue VPH configurations. The flux from strong sky emission lines, such as the OI (6300 Å) line in
the spectrum of the red VPH exposure, has been oversubtracted. The non-zero offset in the blue configurations is due to
scattered light, which introduces a small (<1 ADU/600s) non-uniform flux ramp across the CCD.



3.7 Formatting of final data product 
(frreformat) 

Using intermediate non sky subtracted and sky subtracted 
target RSS frames, the science-ready data product is
assembled in accordance with the format shown in Table 2. In
the one-dimensional spectra, only the flux from the top n 
brightest fibres is summed. To reduce the influence of cos- 
mic rays when identifying the brightest fibres, each spec- 
trum in the target RSS frame is individually smoothed along 
the dispersion axis using a median 10x1 pixel boxcar filter. 
The summation is then done using the unsmoothed data. 
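
The selection and summation of the brightest fibres can be sketched as below. The function names are hypothetical, and ranking the fibres by the total of their smoothed spectra is an assumption about how "brightest" is measured. The window is truncated at the ends of the spectrum, which may differ from the edge handling used in the pipeline.

```python
from statistics import median


def median_boxcar(spectrum, width=10):
    """Running median along the dispersion axis (window truncated at ends)."""
    half = width // 2
    return [
        median(spectrum[max(0, p - half): p - half + width])
        for p in range(len(spectrum))
    ]


def sum_brightest_fibres(rss, n=5, width=10):
    """Rank fibres on smoothed data, then sum the unsmoothed spectra."""
    smoothed_totals = [sum(median_boxcar(s, width)) for s in rss]
    ranked = sorted(range(len(rss)), key=lambda i: smoothed_totals[i],
                    reverse=True)
    brightest = sorted(ranked[:n])
    n_pixels = len(rss[0])
    spectrum = [sum(rss[i][p] for i in brightest) for p in range(n_pixels)]
    return brightest, spectrum


# Fibre 0 is faint but contains a cosmic ray hit at pixel 2.
rss = [
    [1.0, 1.0, 1000.0, 1.0, 1.0, 1.0],
    [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],
    [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
    [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
]
brightest, spectrum = sum_brightest_fibres(rss, n=2, width=3)
```

Without the smoothing, fibre 0 would rank as the brightest because of the cosmic ray; with it, fibres 1 and 2 are selected and their unsmoothed fluxes are summed.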

For observations of point sources, a smaller value of n 
reduces the contamination from cosmic rays, as well as gen- 
erally improving the S/N. For larger values of n, the total 
flux recovered is greater. The percentage of the total flux
recovered by a subset of fibres is dependent on where the
centroid of the target PSF is located on the IFU. The two
limiting cases are where i) the PSF centroid is located at
the centre of a fibre and ii) the PSF centroid is located
exactly between four fibres. Under the assumption of a
Gaussian PSF, the percentage of the total flux recovered in
i) will generally be greater than in ii), except when
n ∈ {4, 5, 6}.

As is shown in Figure 14, a summation of the flux from
five fibres (n = 5) is a good compromise, as the flux from a
point source will be contained (~97.5% of the total) under
average LT seeing conditions of 0.8" to 1.3" regardless of
where the centroid is located on the IFU.

For observations where spatially resolved spectroscopy 
is required, such as the case for extended sources, the one- 
dimensional spectra may not be a useful data product. 

4 Performance 

4.1 Success rates 

The second version of the L2 is now fully integrated into 
the daily automated FRODOSpec reduction process. Two 
types of failure have been seen to occur, with the following 
important distinction between them: 

- A partial failure is deemed as a failure that still produces
an L2 output file. By far the most common partial failure
to occur so far is unsuccessful sky subtraction. It
is worth noting that although this may be considered a
failure in the reduction process, it is by no means the
result of an error in the pipeline and is actually the
result of constraints placed in the sky subtraction routine
(frsubsky) to ensure that it only occurs when considered
statistically sensible (see §3.6).
- A critical failure is deemed as a failure that halts the
production of L2 output data products. Such a failure may
result from the inability of the pipeline to match enough
candidate lines to lines in the corresponding reference
arc line list (see §3.3).

All data taken between 17/06/2011 and 01/08/2011 has
been processed using the L2, allowing success rates to be
determined for this period. Of the 564 frames reduced, 4
experienced a critical failure (~0.7% failure rate) whilst 205
experienced a partial failure (~36.3% failure rate).

It is prudent at this point to reiterate that a partial
failure resulting from unsuccessful sky subtraction does not
impede the processing of the output data products. As such, the
more accurate indicator of reduction success is the success
rate of the pipeline subject to critical failure only. That is to
say, 99.3% of observations taken with FRODOSpec will be
reduced by the L2 to an extracted, throughput corrected and
wavelength calibrated spectrum.

4.2 Execution times

The speed at which the pipeline executes is reliant on the
specification of the system under which it is being executed
and the load the processor is experiencing. L2 reduction
occurs on the LT Archive machine, which has four Intel Xeon
3.20GHz processors and 4GB of RAM. Shown in Table 4
are the average execution times for this machine taken from
the 3 week dataset.

Fig. 14 Percentage of the total flux recovered by n fibres
for different seeing conditions under the assumption of a
gaussian PSF. This quantity is dependent on the location
of the PSF centroid on the IFU. The solid line represents
the case where the centroid is at the centre of a fibre, and
the dashed line represents the case where the centroid lies
exactly between four fibres.

Arm / Dispersive Element    Time         Frames
                            (seconds)

Red Grating                 32.6         61
Red VPH                     32.2         235
Blue Grating                37.4         60
Blue VPH                    36.4         208

Table 4 Average FRODOSpec execution times for each
dispersive element and arm.

5 Conclusions

This paper has explained the characteristics of data taken
with FRODOSpec, and has presented a brief overview of 
the computational methodology used to develop a fully au- 
tonomous pipeline to reduce it. Throughout the paper, repre- 
sentative data taken using each dispersive element and arm 
was used to assess the magnitude of errors and how they 
propagate through the pipeline. The effect of cross-talk has 
been discussed, and its contribution found to be negligible. 
The L2 output data products have also been described, along 
with some key performance indicators. 

The L2 is in a state of continued development. Possible 
future enhancements include the optimal extraction of flux, 
which has been shown to increase S/N with a correspond- 
ing maximum increase in effective exposure time of up to 
70% (Horne 1986). All enhancements and addenda will be
published on the instrument website at |http : //telescope . 
A FRODOSpec Optical Schematic

A FRODOSpec optical schematic is shown in Figure A1.

B FRODOSpec L2 Reduction Preview 
Example 

An example L2 reduction preview image is shown in Figure B1.

C FRODOSpec L2 Reduction Pipeline Flow 
Chart 

A flow chart showing the processes and intermediate data products
of the FRODOSpec L2 reduction is shown in Figure C1.
Fig. A1 FRODOSpec optical schematic. The FRODOSpec fore-optics are mounted on the telescope acquisition and
guidance (A&G) box. The spectrograph optics are bench mounted and fed from the IFU input head by a bundle of 144 
fibres. The pattern used to rearrange the fibres from a two dimensional matrix to a one dimensional pseudo-slit is shown. 
Fig. B1 Example L2 reduction preview image.
[Flow chart. Processes and their routines, in processing order:
- Fibre tramline map generation (frfind, frclean, frtrace)
- Standard aperture extraction of flux (frextract)
- Spectral arc fitting (frarcfit)
- Fibre throughput correction (frcorrectthroughput)
- Wavelength rebinning to a linear calibration (frrebin), shown with
  [RSS_NONSS], [CUBE_NONSS], [SPEC_NONSS], [COLCUBE_NONSS]
- Sky-only fibre identification and subtraction (frsubsky), shown with
  [RSS_SS], [CUBE_SS], [SPEC_SS]
- Reformatting of intermediate data products to desired output
  data product structure (frreformat)
- L2 reduction end]

Fig. C1 FRODOSpec L2 reduction flow chart. The processing sequence is shown with solid arrows. Processes that
generate an intermediate data product that is used to build the output science-ready data product are signified by a dashed
arrow.